    Phonetic inventory for an Arabic speech corpus

    Corpus design for speech synthesis is far better researched for languages such as English than for Modern Standard Arabic, and existing work tends to focus on methods (usually greedy ones) that automatically generate the orthographic transcript to be recorded. In this work, a study of Modern Standard Arabic (MSA) phonetics and phonology is conducted in order to define the criteria for a greedy method that creates a speech corpus transcript for recording. The dataset is reduced several times using these optimisation methods with different parameters, yielding a much smaller dataset with the same phonetic coverage as before the reduction, and this output transcript is chosen for recording. This is part of a larger work to create a completely annotated and segmented speech corpus for MSA.
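
    The abstract does not spell out the greedy method, so the following is only a minimal sketch of the general idea (greedy set cover), not the authors' procedure. The function names `greedy_select` and `toy_units` are hypothetical, and the toy unit extractor (adjacent character pairs) merely stands in for whatever phonetic units and criteria the study actually uses.

```python
def greedy_select(sentences, units_of):
    """Greedily pick sentences until every unit found in the pool is covered.

    `units_of` is an assumed callback that maps a sentence to its set of units.
    """
    coverage = {s: set(units_of(s)) for s in sentences}
    uncovered = set().union(*coverage.values()) if coverage else set()
    chosen = []
    while uncovered:
        # Take the sentence that adds the most units not yet covered.
        best = max(sentences, key=lambda s: len(coverage[s] & uncovered))
        chosen.append(best)
        uncovered -= coverage[best]
    return chosen


def toy_units(sentence):
    # Hypothetical stand-in for phonetic units: adjacent character pairs per word.
    return {w[i:i + 2] for w in sentence.split() for i in range(len(w) - 1)}


pool = ["ab bc", "ab", "bc", "cd"]
selected = greedy_select(pool, toy_units)
```

    In this toy pool the reduced transcript covers the same units as the full pool while containing fewer sentences, which mirrors the reduction described in the abstract.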

    Arabic Speech Corpus

    Arabic/English symbol dictionary: early challenges and technological opportunities

    Over the last ten years there has been an expansion in the number of symbol sets available to Augmentative and Alternative Communication (AAC) users, their therapists, teachers and carers. These sets have tended to be developed in the USA or Europe with English or European language word lists, although some have other language options including Arabic. The problem is that few show the traits of true localisation, where solutions have to be found for “the differences between cultures and the problems that are likely to occur because of these differences” (Evers et al., 2000). Researchers have shown in relation to symbol use for communication that it is important to have:
    • translucency (How appropriate is a proposed symbol for a suggested meaning?) (Bloomberg et al. 1990),
    • guessability (Can subjects guess the intended meaning of a symbol?) (Hanson & Hartzema 1995, Dowse & Ehlers 2001, 2003), and
    • iconicity (How distinctive are the symbols?) (Haupt & Alant 2003).
    Simple language translations may offer word-for-word matching within the lexicons, but they tend to miss the issues of local colloquial vocabulary and cultural, social and environmental differences, which can all affect the speed of communication, especially when many inappropriate icons, pictograms and other types of imagery are used to support dialogue and literacy skills.

    Modern standard Arabic phonetics for speech synthesis

    Arabic phonetics and phonology have not been adequately studied for the purposes of speech synthesis and speech synthesis corpus design. The only sources of knowledge available are either archaic or targeted towards other disciplines such as education. This research conducted a three-stage study. First, Arabic phonology research was reviewed in general, and the results of this review were triangulated with expert opinions, gathered throughout the project, to create a novel formalisation of Arabic phonology for speech synthesis. Secondly, this formalisation was used to create a speech corpus in Modern Standard Arabic and this corpus was used to produce a speech synthesiser. This corpus was the first to be constructed and published for this dialect of Arabic using scientifically-supported phonological formalisms. The corpus was semi-automatically annotated with phoneme boundaries and stress marks; it is word-aligned with the orthographical transcript. The accuracy of these alignments was compared with previous published work, which showed that even slightly less accurate alignments are sufficient for producing high quality synthesis. Finally, objective and subjective evaluations were conducted to assess the quality of this corpus. The objective evaluation showed that the corpus based on the proposed phonological formalism had sufficient phonetic coverage compared with previous work. The subjective evaluation showed that this corpus can be used to produce high quality parametric and unit selection speech synthesisers. In addition, it showed that the use of orthographically extracted stress marks can improve the quality of the generated speech for general purpose synthesis. These stress marks are the first to be tested for Modern Standard Arabic, which thus opens this subject for future research.

    A web based multi-linguists symbol-to-text AAC application

    Bay 13 pecha kucha

    The talks are by EA Draffan, Nawar Halabi, Gareth Beeston and Neil Rogers. In 6m40s and 20 slides, each member of Bay 13 will introduce themselves, explaining their background and research interests, so those in WAIS can put a name to a face, and chat after the event if there are common interests

    Synergistic teamwork using social media for innovative development of an Arabic symbol dictionary

    The use of social media and online systems has allowed participatory research to be undertaken between researchers, therapists and Augmentative and Alternative Communication (AAC) users to ensure the development of meaningful symbols and a core vocabulary for use in an online Arabic Symbol Dictionary. Data gathering and interviews using a series of collaborative systems have enhanced the cultural understanding between researchers and the therapists working with these children and adults. Bespoke systems have also promoted the building of a synergistic team able to respond speedily to users’ needs, collate and analyse data, and collaboratively solve problems that arise. The different types of social media have affected the research and caused the team to reflect on the way they have influenced outcomes.

    Implementing Widely-used Vocabularies to Produce Linked Open Data in the Context of Open Repositories

    Presentation at Open Repositories 2014, Helsinki, Finland, June 9-13, 2014. EPrints Interest Group Presentations.
    The Food and Agriculture Organization (FAO) of the United Nations works to support agricultural information communities in building and maintaining open repositories that meet recommended metadata standards and controlled vocabularies. To this end, FAO set up partnerships with organizations working in this field. The most recent work focuses on the implementation of an authority tool in the submission process of the EPrints software. Its purpose is to facilitate the use of controlled vocabularies published as Linked Open Data, and the exposure of data on the Semantic Web. This project has been technically implemented by the University of Southampton in partnership with FAO, UNESCO-IOC/IODE and Hasselt University Library.
    Halabi, Nawar (University of Southampton); Leinders, Dirk (Hasselt University Library); Goovaerts, Marc (Hasselt University Library); Well, Andrew (University of Southampton); Subirats Coll, Imma (FAO of the United Nations, Italy)